Categorization and characterization of transcript-confirmed constitutively and alternatively spliced introns and exons from human.

نویسندگان

  • Francis Clark
  • T A Thanaraj
چکیده

By spliced alignment of human DNA and transcript sequence data we constructed a data set of transcript-confirmed exons and introns from 2793 genes, 796 of which (28%) were seen to have multiple isoforms. We find that over one-third of human exons can translate in more than one frame, and that this is highly correlated with G+C content. Introns containing adenosine at donor site position +3 (A3), rather than guanosine (G3), are more common in low G+C regions, while the converse is true in high G+C regions. These two classes of introns are shown to have distinct lengths, consensus sequences and correlations among splice signals, leading to the hypothesis that A3 donor sites are associated with exon definition, and G3 donor sites with intron definition. Minor classes of introns, including GC-AG, U12-type GT-AG, weak, and putative AG-dependant introns are identified and characterized. Cassette exons are more prevalent in low G+C regions, while exon isoforms are more prevalent in high G+C regions. Cassette exon events outnumber other alternative events, while exon isoform events involve truncation twice as often as extension, and occur at acceptor sites twice as often as at donor sites. Alternative splicing is usually associated with weak splice signals, and in a majority of cases, preserves the coding frame. The reported characteristics of constitutive and alternative splice signals, and the hypotheses offered regarding alternative splicing and genome organization, have important implications for experimental research into RNA processing. The 'AltExtron' data sets are available at http://www.bit.uq.edu.au/altExtron/ and http://www.ebi.ac.uk/~thanaraj/altExtron/.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

RASE: recognition of alternatively spliced exons in C.elegans

MOTIVATION Eukaryotic pre-mRNAs are spliced to form mature mRNA. Pre-mRNA alternative splicing greatly increases the complexity of gene expression. Estimates show that more than half of the human genes and at least one-third of the genes of less complex organisms, such as nematodes or flies, are alternatively spliced. In this work, we consider one major form of alternative splicing, namely the ...

متن کامل

Bacterial Expression and Functional Characterization of A Naturally Occurring Exon6-less Preprochymosin cDNA

Chymosin (Rennin EC 3.4.23.4), an aspartyl proteinase, is the major proteolytic enzyme in the fourthstomach of the unweaned calf, and it is formed by proteolytic activation of its zymogene, prochymosin.Following the cloning of synthesized cDNAs on mRNA pools extracted from the mucosa of the calf fourthstomach, we have identified an alternatively spliced form of preprochymosin ...

متن کامل

Intronic sequences flanking alternatively spliced exons are conserved between human and mouse.

Comparison of the sequences of mouse and human genomes revealed a surprising number of nonexonic, nonexpressed conserved sequences, for which no function could be assigned. To study the possible correlation between these conserved intronic sequences and alternative splicing regulation, we developed a method to identify exons that are alternatively spliced in both human and mouse. We compiled tw...

متن کامل

The architecture of pre-mRNAs affects mechanisms of splice-site pairing.

The exon/intron architecture of genes determines whether components of the spliceosome recognize splice sites across the intron or across the exon. Using in vitro splicing assays, we demonstrate that splice-site recognition across introns ceases when intron size is between 200 and 250 nucleotides. Beyond this threshold, splice sites are recognized across the exon. Splice-site recognition across...

متن کامل

Applying genetic programming to the prediction of alternative mRNA splice variants.

Genetic programming (GP) can be used to classify a given gene sequence as either constitutively or alternatively spliced. We describe the principles of GP and apply it to a well-defined data set of alternatively spliced genes. A feature matrix of sequence properties, such as nucleotide composition or exon length, was passed to the GP system "Discipulus." To test its performance we concentrated ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Human molecular genetics

دوره 11 4  شماره 

صفحات  -

تاریخ انتشار 2002